Efficient graph-based dictionary search and its application to text-image searching
Internal identifier: 001A38 (Main/Exploration); previous: 001A37; next: 001A39
Authors: Simon Lucas [United Kingdom]
Source:
- Pattern Recognition Letters [0167-8655]; 2001.
Abstract
This paper describes a novel method for applying dictionary knowledge to optimally interpret the confidence-rated hypothesis sets produced by lower-level pattern classifiers. This problem arises whenever image or video databases need to be scanned for textual content, and where some of the text strings are expected to be strings from a dictionary. The method is especially appropriate for large dictionaries, as might occur in vehicle registration number recognition for example. The problem is cast as enumerating the paths in a graph in best-first order given the constraint that each complete path is a word in some specified dictionary. The solution described here is of particular interest due to its generality, flexibility and because the time to retrieve each path is independent of the size of the dictionary. Synthetic results are presented for searching dictionaries of up to 1 million UK postcodes given graphs that correspond to insertion, deletion and substitution errors. We also present the initial results from processing real noisy text images.
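The problem statement in the abstract (enumerating graph paths in best-first order, restricted to dictionary words) can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes the simplest case of substitution errors only, with one confidence-rated hypothesis set per character position, and it stores the dictionary in a trie and searches it with a priority queue. The names `build_trie` and `best_first_words`, the cost definition (sum of negative log confidences), and the sample data are all assumptions made for illustration. The sketch makes no claim to reproduce the property that retrieval time is independent of dictionary size.

```python
import heapq
import itertools
import math

_END = object()  # sentinel key marking "a dictionary word ends here"


def build_trie(words):
    """Store the dictionary as nested dicts keyed by character."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[_END] = True
    return root


def best_first_words(hypotheses, trie):
    """Yield (word, cost) pairs in order of non-decreasing cost.

    hypotheses[i] maps each candidate character at position i to a
    confidence in (0, 1]. The cost of a word is the sum of
    -log(confidence) over its characters, so lower cost is better.
    """
    tie = itertools.count()  # unique tie-breaker: the heap never compares dicts
    heap = [(0.0, next(tie), 0, "", trie)]
    while heap:
        cost, _, pos, prefix, node = heapq.heappop(heap)
        if pos == len(hypotheses):
            # A complete path: report it only if it is a dictionary word.
            if _END in node:
                yield prefix, cost
            continue
        for ch, conf in hypotheses[pos].items():
            child = node.get(ch)
            if child is not None:  # prune paths that leave the dictionary
                step = -math.log(conf)
                heapq.heappush(
                    heap, (cost + step, next(tie), pos + 1, prefix + ch, child)
                )


# Hypothetical classifier output for a three-character string.
hypotheses = [
    {"C": 0.6, "G": 0.4},
    {"O": 0.7, "0": 0.3},
    {"4": 0.9, "A": 0.1},
]
dictionary = ["CO4", "GOA", "C0A", "XYZ"]

for word, cost in best_first_words(hypotheses, build_trie(dictionary)):
    print(word, round(cost, 3))
```

Because every step cost is non-negative, partial paths leave the queue in order of non-decreasing cost, so complete words are produced best first and the caller can stop after the first few results. Handling insertion and deletion errors would need a more general graph than the per-position sets assumed here.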
DOI: 10.1016/S0167-8655(00)00117-3
Links to previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000E43
- to stream Istex, to step Curation: 000E08
- to stream Istex, to step Checkpoint: 001092
- to stream Main, to step Merge: 001B31
- to stream Main, to step Curation: 001A38
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title>Efficient graph-based dictionary search and its application to text-image searching</title>
<author><name sortKey="Lucas, Simon" sort="Lucas, Simon" uniqKey="Lucas S" first="Simon" last="Lucas">Simon Lucas</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:939DD747377DD9DF8FB2CB122C3C52C8F8D97BD5</idno>
<date when="2001" year="2001">2001</date>
<idno type="doi">10.1016/S0167-8655(00)00117-3</idno>
<idno type="url">https://api.istex.fr/document/939DD747377DD9DF8FB2CB122C3C52C8F8D97BD5/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000E43</idno>
<idno type="wicri:Area/Istex/Curation">000E08</idno>
<idno type="wicri:Area/Istex/Checkpoint">001092</idno>
<idno type="wicri:doubleKey">0167-8655:2001:Lucas S:efficient:graph:based</idno>
<idno type="wicri:Area/Main/Merge">001B31</idno>
<idno type="wicri:Area/Main/Curation">001A38</idno>
<idno type="wicri:Area/Main/Exploration">001A38</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a">Efficient graph-based dictionary search and its application to text-image searching</title>
<author><name sortKey="Lucas, Simon" sort="Lucas, Simon" uniqKey="Lucas S" first="Simon" last="Lucas">Simon Lucas</name>
<affiliation wicri:level="1"><country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Department of Computer Science, University of Essex, Colchester CO4 3SQ</wicri:regionArea>
<wicri:noRegion>Colchester CO4 3SQ</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Royaume-Uni</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="j">Pattern Recognition Letters</title>
<title level="j" type="abbrev">PATREC</title>
<idno type="ISSN">0167-8655</idno>
<imprint><publisher>ELSEVIER</publisher>
<date type="published" when="2001">2001</date>
<biblScope unit="volume">22</biblScope>
<biblScope unit="issue">5</biblScope>
<biblScope unit="page" from="551">551</biblScope>
<biblScope unit="page" to="562">562</biblScope>
</imprint>
<idno type="ISSN">0167-8655</idno>
</series>
<idno type="istex">939DD747377DD9DF8FB2CB122C3C52C8F8D97BD5</idno>
<idno type="DOI">10.1016/S0167-8655(00)00117-3</idno>
<idno type="PII">S0167-8655(00)00117-3</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0167-8655</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">This paper describes a novel method for applying dictionary knowledge to optimally interpret the confidence-rated hypothesis sets produced by lower-level pattern classifiers. This problem arises whenever image or video databases need to be scanned for textual content, and where some of the text strings are expected to be strings from a dictionary. The method is especially appropriate for large dictionaries, as might occur in vehicle registration number recognition for example. The problem is cast as enumerating the paths in a graph in best-first order given the constraint that each complete path is a word in some specified dictionary. The solution described here is of particular interest due to its generality, flexibility and because the time to retrieve each path is independent of the size of the dictionary. Synthetic results are presented for searching dictionaries of up to 1 million UK postcodes given graphs that correspond to insertion, deletion and substitution errors. We also present the initial results from processing real noisy text images.</div>
</front>
</TEI>
<affiliations><list><country><li>Royaume-Uni</li>
</country>
</list>
<tree><country name="Royaume-Uni"><noRegion><name sortKey="Lucas, Simon" sort="Lucas, Simon" uniqKey="Lucas S" first="Simon" last="Lucas">Simon Lucas</name>
</noRegion>
<name sortKey="Lucas, Simon" sort="Lucas, Simon" uniqKey="Lucas S" first="Simon" last="Lucas">Simon Lucas</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001A38 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001A38 | SxmlIndent | more
To add a link to this page in the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:939DD747377DD9DF8FB2CB122C3C52C8F8D97BD5 |texte= Efficient graph-based dictionary search and its application to text-image searching }}
This area was generated with Dilib version V0.6.32.